Robust Conversion of CCG Derivations to Phrase Structure Trees
Abstract
We propose an improved, bottom-up method for converting CCG derivations into PTB-style phrase structure trees. In contrast with past work (Clark and Curran, 2009), which used simple transductions on category pairs, our approach uses richer transductions attached to single categories. Our conversion preserves more sentences under round-trip conversion (51.1% vs. 39.6%) and is more robust. In particular, unlike past methods, ours does not require ad-hoc rules over non-local features, and so can be easily integrated into a parser.
Similar resources
A Structural Interpretation of Combinatory Categorial Grammar
This paper gives an interpretation of Combinatory Categorial Grammar derivations in terms of the construction of traditional phrase structure trees. This structural level of representation not only shows how CCG is related to other grammatical investigations, but this paper also uses it to extend CCG in ways which are useful for analyzing and parsing natural language, including a better analysi...
Hindi CCGbank: CCG Treebank from the Hindi Dependency Treebank
In this paper, we present an approach for automatically creating a Combinatory Categorial Grammar (CCG) treebank from a dependency treebank for the Subject-Object-Verb language Hindi. Rather than a direct conversion from dependency trees to CCG trees, we propose a two stage approach: a language independent generic algorithm first extracts a CCG lexicon from the dependency treebank. A determinis...
Chinese CCGbank: extracting CCG derivations from the Penn Chinese Treebank
Automated conversion has allowed the development of wide-coverage corpora for a variety of grammar formalisms without the expense of manual annotation. Analysing new languages also tests formalisms, exposing their strengths and weaknesses. We present Chinese CCGbank, a 760,000 word corpus annotated with Combinatory Categorial Grammar (CCG) derivations, induced automatically from the Penn Chines...
Parsing Noun Phrase Structure with CCG
Statistical parsing of noun phrase (NP) structure has been hampered by a lack of gold-standard data. This is a significant problem for CCGbank, where binary branching NP derivations are often incorrect, a result of the automatic conversion from the Penn Treebank. We correct these errors in CCGbank using a gold-standard corpus of NP structure, resulting in a much more accurate corpus. We also imp...
Comparing the Accuracy of CCG and Penn Treebank Parsers
We compare the CCG parser of Clark and Curran (2007) with a state-of-the-art Penn Treebank (PTB) parser. An accuracy comparison is performed by converting the CCG derivations into PTB trees. We show that the conversion is extremely difficult to perform, but are able to fairly compare the parsers on a representative subset of the PTB test section, obtaining results for the CCG parser that are st...
Publication date: 2012